MOSAIC: A Proximity Graph Approach for Agglomerative Clustering
Authors
Abstract
Representative-based clustering algorithms are popular because they are relatively fast and rest on a sound theoretical foundation. However, the clusters they can obtain are limited to convex shapes, and their results are highly sensitive to initialization. This paper proposes MOSAIC, a novel agglomerative clustering algorithm that greedily merges neighboring clusters so as to maximize a given fitness function. MOSAIC uses Gabriel graphs to determine which clusters are neighbors, and it approximates non-convex shapes as unions of small clusters computed by a representative-based clustering algorithm. Experimental results show that this technique yields higher-quality clusters than running a representative-based clustering algorithm on its own. Given a suitable fitness function, MOSAIC can detect clusters of arbitrary shape. In addition, MOSAIC can handle high-dimensional data.
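The two ingredients named in the abstract, Gabriel-graph neighborhoods and greedy fitness-driven merging, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names (`gabriel_edges`, `greedy_merge`, `toy_fitness`), the stopping rule (stop as soon as no merge improves the fitness), and the toy fitness function are all assumptions made for illustration; the abstract does not specify them. The representatives are assumed to come from a prior run of some representative-based algorithm such as k-means.

```python
from itertools import combinations


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def gabriel_edges(reps):
    """Return the set of index pairs (i, j), i < j, that are Gabriel neighbors.

    Two points are Gabriel neighbors when no other point lies in the closed
    ball whose diameter is the segment joining them, i.e. when
    d(i, j)^2 < d(i, k)^2 + d(j, k)^2 holds for every other point k.
    """
    edges = set()
    for i, j in combinations(range(len(reps)), 2):
        d_ij = sq_dist(reps[i], reps[j])
        if all(sq_dist(reps[i], reps[k]) + sq_dist(reps[j], reps[k]) > d_ij
               for k in range(len(reps)) if k not in (i, j)):
            edges.add((i, j))
    return edges


def greedy_merge(reps, fitness):
    """Greedily merge neighboring clusters while the fitness improves.

    group[i] is the cluster id of representative i. Two clusters are
    neighbors if at least one Gabriel edge links their representatives.
    Stopping when no merge improves fitness is an assumption of this sketch.
    """
    group = list(range(len(reps)))
    edges = gabriel_edges(reps)
    best = fitness(group)
    while True:
        pairs = sorted({tuple(sorted((group[i], group[j])))
                        for i, j in edges if group[i] != group[j]})
        # Score every candidate merge of two neighboring clusters.
        trials = [[a if g == b else g for g in group] for a, b in pairs]
        scored = [(fitness(t), t) for t in trials]
        if not scored:
            return group
        score, trial = max(scored, key=lambda s: s[0])
        if score <= best:
            return group
        best, group = score, trial


# Hypothetical toy data: five representatives forming two groups on a line.
reps = [(0, 0), (1, 0), (2, 0), (10, 0), (11, 0)]


def toy_fitness(group):
    """Hypothetical fitness: prefer few clusters, penalize far-apart members."""
    far = sum(1 for i, j in combinations(range(len(reps)), 2)
              if group[i] == group[j] and sq_dist(reps[i], reps[j]) > 9)
    return -len(set(group)) - 10 * far


result = greedy_merge(reps, toy_fitness)
print(result)  # [0, 0, 0, 3, 3]
```

The brute-force Gabriel test above takes cubic time in the number of representatives, which is acceptable for a sketch because merging operates on cluster representatives rather than on all data points.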
Similar resources
MOSAIC: Agglomerative Clustering with Gabriel Graphs
Representative-based clustering algorithms are popular because they are relatively fast and rest on a sound theoretical foundation. However, the clusters they can obtain are limited to convex shapes, and their results are highly sensitive to initialization. In this paper, a novel agglomerative clustering algorithm called MOSAIC is proposed which greedily merges ...
LEGClust - A Clustering Algorithm Based on Layered Entropic Subgraphs
Hierarchical clustering is a stepwise clustering method usually based on proximity measures between objects or sets of objects from a given data set. The most common proximity measures are distance measures. The derived proximity matrices can be used to build graphs, which provide the basic structure for some clustering methods. We present here a new proximity matrix based on an entropic measur...
TCUAP: A Novel Approach of Text Clustering Using Asymmetric Proximity
Text documents have sparse data spaces, and existing text clustering methods use symmetric proximity to measure the correlation of documents. In this paper, we propose a novel approach to strengthen the discriminative feature of document objects, which uses asymmetric proximity for text clustering. We present a measure of asymmetric proximity between documents and between clusters. TCU...
Clustering of bipartite advertiser-keyword graph
In this paper we present top-down and bottom-up hierarchical clustering methods for large bipartite graphs. The top-down approach employs a flow-based graph partitioning method, while the bottom-up approach is a multiround hybrid of the single-link and average-link agglomerative clustering methods. We evaluate the quality of clusters obtained by these two methods using additional textual inform...
Agglomerative connectivity constrained clustering for image segmentation
We consider the problem of clustering under the constraint that data points in the same cluster are connected according to a pre-existing graph. This constraint can be efficiently addressed by an agglomerative clustering approach, which we exploit to construct a new fully automatic segmentation algorithm for color photographs. For image segmentation, if the pixel grid with eight neighbor connect...